Efficiency Considerations for Scalable Information Retrieval Servers
Authors
Abstract
We review a variety of techniques to improve efficiency in information retrieval. Given the increasing volumes of data that are available electronically, understanding and using such techniques is critical. We address several efficiency concerns, but our primary focus is on index processing, since it dominates the computational demands of information retrieval. Given the importance of index processing, in addition to a general overview, we include some recent index maintenance results. These results demonstrate that delaying the update of the index when additional documents are introduced to the collection improves efficiency without noticeably degrading the effectiveness of information retrieval. We conclude with an overview of parallel processing in information retrieval. Since users cannot tolerate lengthy response times, searching large text databases requires vast computational resources. Parallel processing is currently the only means to support these demands. We focus only on those approaches that are currently commercially viable.
Related resources
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Efficiency Considerations for Scalable Information Retrieval Servers
We overview a variety of techniques to improve efficiency in information retrieval. Given the increasing volumes of data that are available electronically, understanding and using such techniques is critical. We address several efficiency concerns, but our primary focus is on index processing since it dominates the computational demands of information retrieval. Given the importance of index proces...
Behavioral Considerations in Developing Web Information Systems: User-centered Design Agenda
The current paper explores designing a web information retrieval system regarding the searching behavior of users in real and everyday life. Designing an information system that is closely linked to human behavior is equally important for providers and the end users. From an Information Science point of view, four approaches in designing information retrieval systems were identified as system-...
Text Based Approaches for Content Based Image Retrieval in a P2P Network
The tremendous growth of digital multimedia content on the web requires scalable, efficient, and effective information retrieval mechanisms. Handling such large collections of data in a centralized way requires costly high bandwidth connectivity and powerful servers. This establishes the need of distributed architectures, such as peer-to-peer systems, that allow sharing of data management and s...
Journal: J. Digit. Inf.
Volume: 1
Publication date: 2000